Eligibility Propagation to Speed up Time Hopping for Reinforcement Learning
Abstract
A mechanism called Eligibility Propagation is proposed to speed up the Time Hopping technique used for faster Reinforcement Learning in simulations. Eligibility Propagation gives Time Hopping capabilities similar to those that eligibility traces give conventional Reinforcement Learning: it propagates values from a state to all of its temporal predecessors using a state-transition graph. Experiments on a simulated biped crawling robot confirm that Eligibility Propagation accelerates the learning process more than threefold.
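The propagation step described above can be illustrated with a minimal tabular sketch, assuming Q-learning-style values and a recorded mapping from each state to its observed temporal predecessors; the data structures, parameter values, and function names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of backward value propagation over a recorded
# state-transition graph (illustrative assumptions, not the paper's code).
from collections import defaultdict, deque

GAMMA = 0.95  # discount factor (assumed)
ALPHA = 0.5   # learning rate (assumed)

Q = defaultdict(float)           # Q[(state, action)] -> estimated value
predecessors = defaultdict(set)  # state -> {(prev_state, action, reward), ...}

def record_transition(s, a, r, s_next):
    """Remember that taking action a in state s led to s_next with reward r,
    so later changes at s_next can be pushed back to its predecessors."""
    predecessors[s_next].add((s, a, r))

def propagate(start_state, actions, tol=1e-4):
    """Propagate the value of start_state to all of its temporal
    predecessors, following the transition graph breadth-first until
    the induced changes fall below tol."""
    queue = deque([start_state])
    while queue:
        s = queue.popleft()
        v = max((Q[(s, a)] for a in actions), default=0.0)
        for (s_prev, a_prev, r) in predecessors[s]:
            delta = ALPHA * (r + GAMMA * v - Q[(s_prev, a_prev)])
            if abs(delta) > tol:
                Q[(s_prev, a_prev)] += delta
                queue.append(s_prev)  # this predecessor changed; keep propagating
```

The breadth-first sweep lets a single value change reach distant predecessors in one call, mirroring the role that eligibility traces play in conventional Reinforcement Learning.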
Similar Articles
Time Hopping Technique for Reinforcement Learning and its Application to Robot Control
To speed up the convergence of reinforcement learning (RL) algorithms by more efficient use of computer simulations, three algorithmic techniques are proposed: Time Manipulation, Time Hopping, and Eligibility Propagation. They are evaluated on various robot control tasks. The proposed Time Manipulation [1] is a concept of manipulating the time inside a simulation and using it as a tool to speed...
Bidding Strategy on Demand Side Using Eligibility Traces Algorithm
Restructuring in the power industry splits the industry into separate parts and creates competition between the purchasing and selling sections. As a consequence, through active participation in the energy market, service provider companies and large consumers create a context for overcoming the problems resulting from the lack of demand-side participation in the market. The most prominent ch...
Heuristically Accelerated Reinforcement Learning: Theoretical and Experimental Results
Since finding control policies using Reinforcement Learning (RL) can be very time-consuming, in recent years several authors have investigated how to speed up RL algorithms by making improved action selections based on heuristics. In this work we present new theoretical results (convergence and an upper bound on value estimation errors) for the class that encompasses all heuristics-based al...
Investigating Recurrence and Eligibility Traces in Deep Q-Networks
Eligibility traces in reinforcement learning are used as a bias-variance trade-off and can often speed up training time by propagating knowledge back over time-steps in a single update. We investigate the use of eligibility traces in combination with recurrent networks in the Atari domain. We illustrate the benefits of both recurrent nets and eligibility traces in some Atari games, and highligh...
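For comparison with the mechanism proposed above, here is a minimal sketch of the conventional accumulating-trace TD(lambda) update that these abstracts refer to; the tabular setting, parameter values, and function name are illustrative assumptions, not taken from any of the cited papers.

```python
# Minimal sketch of tabular TD(lambda) with accumulating eligibility traces
# (illustrative assumptions only).
from collections import defaultdict

GAMMA, ALPHA, LAMBDA = 0.95, 0.1, 0.9  # assumed hyperparameters

V = defaultdict(float)       # state-value estimates
traces = defaultdict(float)  # eligibility trace per state

def td_lambda_step(s, r, s_next):
    """One update: the TD error for the transition (s, r, s_next) is spread
    over all recently visited states in proportion to their decaying traces."""
    delta = r + GAMMA * V[s_next] - V[s]
    traces[s] += 1.0                     # accumulate trace for current state
    for state in list(traces):
        V[state] += ALPHA * delta * traces[state]
        traces[state] *= GAMMA * LAMBDA  # decay every trace toward zero
```

This is the sense in which traces "propagate knowledge back over time-steps in a single update": every state with a non-zero trace receives a share of the current TD error.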